A transaction is a way to group several reads or writes of data into a logical unit. It succeeds (commit) or fails (rollback, abort). Through transactions, databases provide safety guarantees.

ACID

Implementing ACID

Classes of DB concurrency issues

  1. Dirty reads: when one transaction is able to see data written by another that has not yet been committed.
  2. Dirty writes: when one transaction overwrites an uncommitted value from a concurrent, uncommitted, transaction.
  3. Read skew: when a transaction reads different values for the same data during its execution because other concurrent transactions have modified it.
  4. Lost update: occurs if an application reads some value from the DB, modifies it, and writes back the modified value. If two transactions do this concurrently, one of the modifications can be lost.
    • some databases provide atomic update operations to solve this
  5. Write skew: caused by updates to separate records in a database that none-the-less cause a conflict. Occurs if two transactions read the same objects, and then update some of those objects. General case of dirty write or read skew.
  6. Phantoms: another specific type of write skew, where a write in one transaction changes the result of a search query in another transaction.

Isolation Levels

Read Committed

Default setting in many databases (e.g., PostgreSQL)

Two guarantees:

  1. When reading, you will only see committed data (no dirty reads).
  2. When writing, you will only overwrite data that has been committed (no dirty writes).

Dirty writes are most commonly prevented with row-level locks.

Dirty reads are prevented by remembering both the old committed value and the new value set by an in progress transaction. Reads are given the remembered old value until the writing transaction is committed.

Snapshot Isolation

Read committed guarantees + no read skew.

Each transaction reads from a consistent snapshot of the database. It sees all data as it existed in the DB at the start of the transaction.

Serializable Isolation

Sources

  1. Designing Data Intensive Applications, Martin Kleppmann